AITopics | dorottya demszky

Collaborating Authors

dorottya demszky

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Measuring Teaching with LLMs

Hardy, Michael

arXiv.org Artificial IntelligenceNov-7-2025

Objective and scalable measurement of teaching quality is a persistent challenge in education. While Large Language Models (LLMs) offer potential, general-purpose models have struggled to reliably apply complex, authentic classroom observation instruments. This paper uses custom LLMs built on sentence-level embeddings, an architecture better suited for the long-form, interpretive nature of classroom transcripts than conventional subword tokenization. We systematically evaluate five different sentence embeddings under a data-efficient training regime designed to prevent overfitting. Our results demonstrate that these specialized models can achieve human-level and even super-human performance with expert human ratings above 0.65 and surpassing the average human-human rater correlation. Further, through analysis of annotation context windows, we find that more advanced models-those better aligned with human judgments-attribute a larger share of score variation to lesson-level features rather than isolated utterances, challenging the sufficiency of single-turn annotation paradigms. Finally, to assess external validity, we find that aggregate model scores align with teacher value-added measures, indicating they are capturing features relevant to student learning. However, this trend does not hold at the individual item level, suggesting that while the models learn useful signals, they have not yet achieved full generalization. This work establishes a viable and powerful new methodology for AI-driven instructional measurement, offering a path toward providing scalable, reliable, and valid feedback for educator development.

correlation, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2510.22968

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry: Education > Educational Setting (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education

Xu, Paiheng, Liu, Jing, Jones, Nathan, Cohen, Julie, Ai, Wei

arXiv.org Artificial IntelligenceApr-3-2024

Assessing instruction quality is a fundamental component of any improvement efforts in the education system. However, traditional manual assessments are expensive, subjective, and heavily dependent on observers' expertise and idiosyncratic factors, preventing teachers from getting timely and frequent feedback. Different from prior research that mostly focuses on low-inference instructional practices on a singular basis, this paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices in two distinct educational settings: in-person K-12 classrooms and simulated performance tasks for pre-service teachers. This is also the first study that applies NLP to measure a teaching practice that is widely acknowledged to be particularly effective for students with special needs. We confront two challenges inherent in NLP-based instructional analysis, including noisy and long input data and highly skewed distributions of human ratings. Our results suggest that pretrained Language Models (PLMs) demonstrate performances comparable to the agreement level of human raters for variables that are more discrete and require lower inference, but their efficacy diminishes with more complex teaching practices. Interestingly, using only teachers' utterances as input yields strong results for student-centered variables, alleviating common concerns over the difficulty of collecting and transcribing high-quality student speech data in in-person teaching settings. Our findings highlight both the potential and the limitations of current NLP techniques in the education domain, opening avenues for further exploration.

dataset, simse dataset, student, (15 more...)

arXiv.org Artificial Intelligence

2404.02444

Country:

North America > United States > Maryland (0.04)
North America > Canada > Ontario > Toronto (0.04)
Africa > Middle East > Morocco (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)

Industry:

Education > Educational Setting > Online (0.93)
Education > Curriculum > Subject-Specific Education (0.68)
Education > Educational Setting > K-12 Education (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

Add feedback

Edu-ConvoKit: An Open-Source Library for Education Conversation Data

Wang, Rose E., Demszky, Dorottya

arXiv.org Artificial IntelligenceFeb-7-2024

We introduce Edu-ConvoKit, an open-source library designed to handle pre-processing, annotation and analysis of conversation data in education. Resources for analyzing education conversation data are scarce, making the research challenging to perform and therefore hard to access. We address these challenges with Edu-ConvoKit. Edu-ConvoKit is open-source (https://github.com/stanfordnlp/edu-convokit ), pip-installable (https://pypi.org/project/edu-convokit/ ), with comprehensive documentation (https://edu-convokit.readthedocs.io/en/latest/ ). Our demo video is available at: https://youtu.be/zdcI839vAko?si=h9qlnl76ucSuXb8- . We include additional resources, such as Colab applications of Edu-ConvoKit to three diverse education datasets and a repository of Edu-ConvoKit related papers, that can be found in our GitHub repository.

dataset, dorottya demszky, edu-convokit, (15 more...)

arXiv.org Artificial Intelligence

2402.05111

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > Canada > Ontario > Toronto (0.05)
North America > United States > Virginia > Fairfax County > Reston (0.04)

Genre:

Instructional Material (1.00)
Research Report > Experimental Study (0.46)

Industry: Education > Educational Setting (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Software (0.91)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)

Add feedback